Learning DNF by Decision Trees
Abstract
We investigate the problem of learning DNF concepts from examples using decision trees as a concept description language. Due to the replication problem, DNF concepts do not always have a concise decision tree description when the tests at the nodes are limited to the initial attributes. However, the representational complexity may be overcome by using high-level attributes as tests. We present a novel algorithm that modifies the initial bias determined by the primitive attributes by adaptively enlarging the attribute set with high-level attributes. We show empirically that this algorithm outperforms a standard decision tree algorithm for learning small random DNF with and without noise, when the examples are drawn from the uniform distribution.
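The replication problem described above can be made concrete with a tiny experiment. The sketch below (not the paper's algorithm; the concept, feature names, and the exhaustive search are illustrative assumptions) measures the minimal number of leaves any exact decision tree needs for the 2-term DNF (x0 AND x1) OR (x2 AND x3), first over the primitive attributes alone and then with one high-level attribute a = x0 AND x1 added as a test:

```python
from itertools import product

# Illustrative target concept: a 2-term DNF, f(x) = (x0 AND x1) OR (x2 AND x3).
def f(x):
    return int((x[0] and x[1]) or (x[2] and x[3]))

def min_leaves(points):
    """Minimal leaf count of any decision tree classifying `points` exactly,
    found by exhaustive search over split features (feasible only for tiny n)."""
    labels = {lab for _, lab in points}
    if len(labels) <= 1:
        return 1  # pure node: one leaf suffices
    best = None
    for i in range(len(points[0][0])):
        zero = [(x, lab) for x, lab in points if x[i] == 0]
        one = [(x, lab) for x, lab in points if x[i] == 1]
        if not zero or not one:
            continue  # feature is constant on this subset; splitting gains nothing
        cost = min_leaves(zero) + min_leaves(one)
        if best is None or cost < best:
            best = cost
    return best

# All 16 examples over the primitive attributes x0..x3.
primitive = [(x, f(x)) for x in product((0, 1), repeat=4)]

# The same examples with one high-level attribute appended: a = x0 AND x1.
augmented = [((*x, int(x[0] and x[1])), f(x)) for x in product((0, 1), repeat=4)]

print(min_leaves(primitive))  # subtree for x2 AND x3 is replicated -> 7 leaves
print(min_leaves(augmented))  # testing the conjunction a first -> 4 leaves
```

With only primitive tests the subtree deciding x2 AND x3 must be replicated under both outcomes of the first split, while the single high-level test collapses one term of the DNF, which is the effect the algorithm in the paper exploits.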
Similar resources
Tight Bounds on Proper Equivalence Query Learning of DNF
We prove a new structural lemma for partial Boolean functions f, which we call the seed lemma for DNF. Using the lemma, we give the first subexponential algorithm for proper learning of poly(n)-term DNF in Angluin's Equivalence Query (EQ) model. The algorithm has time and query complexity 2^Õ(√n), which is optimal. We also give a new result on certificates for DNF-size, a simple algorithm for pro...
The complexity of properly learning simple concept classes
We consider the complexity of properly learning concept classes, i.e. when the learner must output a hypothesis of the same form as the unknown concept. We present the following new upper and lower bounds on well-known concept classes: • We show that unless NP = RP, there is no polynomial-time PAC learning algorithm for DNF formulas where the hypothesis is an OR-of-thresholds. Note that as spec...
COMS 6253: Advanced
Previously: • Administrative basics, introduction and high-level overview • Concept classes and the relationships among them: DNF formulas, decision trees, decision lists, linear and polynomial threshold functions. • The Probably Approximately Correct (PAC) learning model. • PAC learning linear threshold functions in poly(n, 1/ε, log 1/δ) time • PAC learning polynomial threshold functions. Toda...
Decision Trees: More Theoretical Justification
We study impurity-based decision tree algorithms such as CART, C4.5, etc., so as to better understand their theoretical underpinnings. We consider such algorithms on special forms of functions and distributions. We deal with the uniform distribution and functions that can be described as unate functions, linear threshold functions and read-once DNF. For unate functions we show that maximal ...
TEL-AVIV UNIVERSITY RAYMOND AND BEVERLY SACKLER FACULTY OF EXACT SCIENCES SCHOOL OF COMPUTER SCIENCE Decision Trees: More Theoretical Justification for Practical Algorithms
We study impurity-based decision tree algorithms such as CART, C4.5, etc., so as to better understand their theoretical underpinnings. We consider such algorithms on special forms of functions and distributions. We deal with the uniform distribution and functions that can be described as Boolean linear threshold functions and read-once DNF. We show that for Boolean linear threshold function...
Learning DNF from Random Walks
We consider a model of learning Boolean functions from examples generated by a uniform random walk on {0, 1}n. We give a polynomial time algorithm for learning decision trees and DNF formulas in this model. This is the first efficient algorithm for learning these classes in a natural passive learning model where the learner has no influence over the choice of examples used for learning. Support...